RE-MuSiC: a tool for multiple sequence alignment with regular expression constraints
نویسندگان
چکیده
RE-MuSiC is a web-based multiple sequence alignment tool that can incorporate biological knowledge about structure, function, or conserved patterns regarding the sequences of interest. It accepts amino acid or nucleic acid sequences and a set of constraints as inputs. The constraints are pattern descriptions, instead of exact positions of fragments to be aligned together. The output is an alignment where for each pattern (constraint), an occurrence on each sequence can be found aligned together with those on the other sequences, in a manner that the overall alignment is optimized. Its predecessor, MuSiC, has been found useful by researchers since its release in 2004. However, it is noticed in applications that the pattern formulation adopted in MuSiC, namely, plain strings allowing mismatches, is not expressive and flexible enough. The constraint formulation adopted in RE-MuSiC is therefore enhanced to be regular expressions, which is convenient in expressing many biologically significant patterns like those collected in the PROSITE database, or structural consensuses that often involve variable ranges between conserved parts. Experiments demonstrate that RE-MuSiC can be used to help predict important residues and locate phylogenetically conserved structural elements. RE-MuSiC is available on-line at http://140.113.239.131/RE-MUSIC.
منابع مشابه
CSA-X: Modularized Constrained Multiple Sequence Alignment
Imposing additional constraints on multiple sequence alignment (MSA) algorithms can often produce more biologically meaningful alignments. Hence, various constrained multiple sequence alignment (CMSA) algorithms have been developed in the literature, where researchers used anchor points, regular expressions, or context-free-grammars to specify the constraints, wherein alignments produced are fo...
متن کاملMuSiC: a tool for multiple sequence alignment with constraints
SUMMARY MuSiC is a web server to perform the constrained alignment of a set of sequences, such that the user-specified residues/nucleotides are aligned with each other. The input of the MuSiC system consists of a set of protein/DNA/RNA sequences and a set of user-specified constraints, each with a fragment of residue/nucleotide that (approximately) appears in all input sequences. The output of ...
متن کاملMultiple Sequence Alignments with Regular Expression Constraints on a Cloud Service System
Multiple sequence alignments with constraints are of priority concern in computational biology. Constrained sequence alignment incorporates the domain knowledge of biologists into sequence alignments such that the user-specified residues/segments are aligned together according to the alignment results. A series of constrained multiple sequence alignment tools have been developed in relevant lit...
متن کاملAn Application of the ABS LX Algorithm to Multiple Sequence Alignment
We present an application of ABS algorithms for multiple sequence alignment (MSA). The Markov decision process (MDP) based model leads to a linear programming problem (LPP), whose solution is linked to a suggested alignment. The important features of our work include the facility of alignment of multiple sequences simultaneously and no limit for the length of the sequences. Our goal here is to ...
متن کاملA Branching Alignment-Based Synthesis of Regular Expressions
We propose a novel Multiple Sequence Alignment algorithm which is able to build an optimized branching graph given a set of positive matching sample strings. The algorithm is principally based on Minimum Edit Distance approach being applied incrementally. However, we essentially extended the set of edit operations. The newly added operations allow implementing an acyclic graph drawing feature. ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره 35 شماره
صفحات -
تاریخ انتشار 2007